ABSTRACT

A real-time neural network model is described in which reinforcement helps to focus
attention upon and organize learning of those environmental events and contingencies that
have predicted behavioral success in the past. Computer simulations of the model reproduce
properties of attentional blocking, inverted-U in learning as a function of interstimulus
interval, primary and secondary excitatory and inhibitory conditioning, anticipatory
conditioned responses, attentional focussing by conditioned motivational feedback, and
limited capacity short term memory processing. Qualitative explanations are offered of
why conditioned responses extinguish when a conditioned excitor is presented alone, but
do not extinguish when a conditioned inhibitor is presented alone. These explanations
invoke associative learning between sensory representations and drive, or emotional,
representations (in the form of conditioned reinforcer and incentive motivational learning),
between sensory representations and learned expectations of future sensory events, and
between sensory representations and learned motor commands. Drive representations are
organized in opponent positive and negative pairs (e.g., fear and relief), linked together
by recurrent gated dipole, or READ, circuits. Cognitive modulation of conditioning is
regulated by adaptive resonance theory, or ART, circuits which control the learning and
matching of expectations, and the match-contingent reset of sensory short term memory.
Dendritic spines are invoked to dissociate read-in and read-out of associative learning and
to thereby design a memory which does not passively decay, does not saturate, and can be
actively extinguished by opponent interactions.
1. Introduction

A  key  problem  in  biological  theories  of  intelligence  concerns  the  manner  in  which 
external  events  interact  with  internal  organismic  requirements  to  trigger  learning  processes 
capable  of  focussing  attention  upon  motivationally  desired  goals.  The  results  reported 
herein  further  develop  a  neural  theory  of  learning  and  memory  (Grossberg,  1982,  1987)  in 
which  sensory-cognitive  and  cognitive-reinforcement  circuits  help  to  focus  attention  upon 
and  organize  learning  of  those  environmental  events  that  predict  behavioral  success. 

The first set of results (Grossberg and Levine, 1987) describes computer simulations
that show how the model reproduces properties of attentional blocking, inverted-U in
learning as a function of interstimulus interval, anticipatory conditioned responses,
secondary reinforcement, attentional focussing by conditioned motivational feedback, and
limited capacity short-term memory processing. Conditioning occurs from sensory to drive
representations (“conditioned reinforcer” learning), from drive to sensory representations
(“incentive motivational” learning), and from sensory to motor representations (“habit”
learning). The conditionable pathways contain long-term memory traces that obey a
non-Hebbian associative law. The neural model embodies a solution of two key design problems
of conditioning, the synchronization and persistence problems. This model of vertebrate
learning has also been compared with data and models of invertebrate learning.
Predictions derived from models of vertebrate learning have been compared with data about
invertebrate learning, including data from Aplysia about facilitator neurons and data from
Hermissenda about voltage-dependent Ca2+ currents.

In the second set of results (Grossberg and Schmajuk, 1987), representations are
expanded to include positive and negative opponent drive representations, as in the
opponency between fear and relief. The expanded real-time neural network model is developed
to explain data about the acquisition and extinction of conditioned excitors and inhibitors.
Systematic computer simulations have been performed to characterize a READ circuit,
which joins together a mechanism of associative learning with an opponent processing
circuit, called a recurrent gated dipole. READ circuit properties clarify how positive and
negative reinforcers are learned and extinguished during primary and secondary
conditioning. Habituating chemical transmitters within a gated dipole determine an affective
adaptation level, or context, against which later events are evaluated. Neutral CS's can
become reinforcers by being associated either with direct activations or with antagonistic
rebounds within a previously habituated dipole. Neural mechanisms are characterized
whereby conditioning can be actively extinguished, by a process called opponent extinction,
even if no passive memory decay occurs.

READ circuit mechanisms are joined to mechanisms for associative learning of
incentive motivation; for activating and storing internal representations of sensory cues in
a limited capacity short term memory (STM); for learning, matching, and mismatching
sensory expectancies, leading to the enhancement or updating of STM; and for shifting
the focus of attention toward sensory representations whose reinforcement history is
consistent with momentary appetitive requirements. This architecture has been used to
explain conditioning and extinction of a conditioned excitor; conditioning and extinction
of a conditioned inhibitor; properties of conditioned inhibition as a “slave” process and
as a “comparator” process, including effects of pretest deflation or inflation of the
conditioning context, of familiar or novel training or test contexts, of weak or strong shocks,
and of preconditioning US-alone exposures. The same mechanisms have also been used
(Grossberg, 1982, 1987) to explain phenomena such as unblocking, overshadowing, latent
inhibition, superconditioning, partial reinforcement acquisition effect, learned helplessness,
and vicious-circle behavior. The theory clarifies why alternative models have been unable
to explain an equally large data base.
2. Neural Network Macrocircuits

Two  types  of  macrocircuits  control  learning  within  the  model. 

Sensory-Cognitive  Circuit:  Sensory-cognitive  interactions  in  the  theory  are  carried 
out  by  an  Adaptive  Resonance  Theory  (ART)  circuit  (Carpenter  and  Grossberg,  1985, 
1987a, 1987b; Grossberg, 1978, 1987). The ART architecture suggests how internal
representations of sensory events, including conditioned stimuli (CS) and unconditioned stimuli
(US),  can  be  learned  in  stable  fashion  (Figure  1).  Among  the  mechanisms  used  for  stable 
self-organization  of  sensory  recognition  codes  are  top-down  expectations  which  are  matched 
against  bottom-up  sensory  signals.  When  a  mismatch  occurs,  an  arousal  burst  acts  to  reset 
the  sensory  representation  of  all  cues  that  are  currently  being  stored  in  STM.  In  particular, 
representations  with  high  STM  activation  tend  to  become  less  active,  representations  with 
low  STM  activation  tend  to  become  more  active,  and  the  novel  event  which  caused  the 
mismatch  tends  to  be  more  actively  stored  than  it  would  have  been  had  it  been  expected. 

Figure  1 

Cognitive-Reinforcement Circuit: Cognitive-reinforcer interactions in the theory
are carried out in the circuit described in Figure 2. In this circuit, there exist cell
populations that are separate from sensory representations and related to particular drives
and motivational variables (Grossberg, 1972, 1987). Repeated pairing of a CS sensory
representation, $S_{CS}$, with activation of a drive representation, $D$, by a reinforcer causes the
modifiable synapses connecting $S_{CS}$ with $D$ to become strengthened. Incentive motivation
pathways from the drive representations to the sensory representations are also assumed
to be conditionable. These $S \to D \to S$ feedback pathways shift the attentional focus
to the set of previously reinforced, motivationally compatible cues (Figure 2). This shift
of attention occurs because the sensory representations, which emit conditioned reinforcer
signals and receive incentive motivation signals, compete among themselves for a limited
capacity short-term memory (STM) via a shunting on-center off-surround anatomy. When
incentive motivational feedback signals are received at the sensory representational field,
these signals can bias the competition for STM activity towards motivationally salient
cues.

Figure  2 

3. Attentional Blocking and Interstimulus Interval

The  attentional  modulation  of  Pavlovian  conditioning  is  part  of  the  general  problem  of 
how  an  information  processing  system  can  selectively  process  those  environmental  inputs 
that  are  most  important  to  the  current  goals  of  the  system.  A  key  example  is  the  blocking 
paradigm studied by Kamin (1969) (Figure 3). First, a stimulus $CS_1$, such as a tone, is
presented several times, followed at a given time interval by an unconditioned stimulus
US, such as electric shock, until a conditioned response, such as fear, develops. Then $CS_1$
and another stimulus $CS_2$, such as a light, are presented together, followed at the same
time interval by the US. Finally, $CS_2$ is presented alone, not followed by a US, and no
conditioned response occurs.

Figure  3 

The  blocking  paradigm  suggests  four  key  subproblems  of  the  selective  information 
processing problem. These subproblems are: (1) How does the pairing of $CS_1$ with US
in the first phase of the blocking experiment endow the $CS_1$ cue with properties of a
conditioned,  or  secondary,  reinforcer?  (2)  How  do  the  reinforcing  properties  of  a  cue  shift 
the  focus  of  attention  towards  its  own  processing?  (3)  How  does  the  limited  capacity  of 
attentional  resources  arise,  so  that  a  shift  of  attention  towards  one  set  of  cues  can  prevent 
other  cues  from  being  attended?  (4)  How  does  withdrawal  of  attention  from  a  cue  prevent 
that  cue  from  entering  into  new  conditioned  relationships? 

The explanation of blocking also leads to an explanation of the inverted-U relationship
between strength of the conditioned response (measured in one of several ways) and the
time interval (ISI) between conditioned and unconditioned stimuli. Figure 4 gives an
example of experimental data on the effects of ISI from studies of Smith et al. (1969) and
Schneiderman and Gormezano (1964) of the rabbit nictitating membrane response. This
is  noteworthy  because  Sutton  and  Barto  (1981)  previously  stated  that  the  ISI  data  pose  a 
difficulty  for  any  network  with  associative  synapses,  that  is,  synapses  whose  efficacy  changes 
as  a  function  of  the  correlation  between  presynaptic  and  postsynaptic  activities.  They 
argued  that  a  network  with  associative  synapses  should,  to  a  first  approximation,  have  an 
optimal  ISI  of  zero  because  cross-correlation  between  two  stimulus  traces  is  strongest  when 
the  two  stimuli  occur  simultaneously.  To  avoid  this  difficulty,  other  modellers  introduced 
a  delay  in  the  CS  pathway  that  was  equal  to  the  optimal  ISI.  But  such  a  delay  would  delay 
the  CR  by  an  equal  amount,  and  hence  is  incompatible  with  the  so-called  anticipatory 
CR  that  occurs  before  US  onset.  On  this  basis,  Sutton  and  Barto  suggested  a  different 
synaptic  modification  rule  at  the  single-unit  level. 
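
The cross-correlation argument attributed above to Sutton and Barto can be illustrated numerically. The following is a minimal sketch, assuming that each stimulus leaves an exponentially decaying trace and that learning is proportional to the time integral of the product of the two traces. The function names, the time constant `tau`, and the integration settings are illustrative assumptions and are not taken from any of the cited models.

```python
import math


def trace(t, onset, tau=1.0):
    """Exponentially decaying stimulus trace that starts at `onset` (assumed form)."""
    return math.exp(-(t - onset) / tau) if t >= onset else 0.0


def trace_overlap(isi, tau=1.0, t_end=20.0, dt=0.001):
    """Riemann-sum estimate of the integral of (CS trace) x (US trace).

    The CS starts at time 0 and the US starts at time `isi`.
    """
    steps = round(t_end / dt)
    total = 0.0
    for n in range(steps):
        t = n * dt
        total += trace(t, 0.0, tau) * trace(t, isi, tau) * dt
    return total


# Overlap of the two traces for several interstimulus intervals.
overlaps = {isi: trace_overlap(isi) for isi in (0.0, 0.5, 1.0, 2.0)}
```

Under these assumptions the overlap is largest at an ISI of zero and falls off monotonically as the ISI grows, which is the property that makes a purely correlational synapse predict an optimal ISI of zero.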

Figure  4 

Our simulations, by contrast, reproduce both the ISI data and the anticipatory CR
without invoking a long delay in the CS pathway. Poor conditioning with CS and US
simultaneous, or nearly so, is explained by a mechanism identical to the blocking mechanism
except that $CS_1$ is replaced by US and $CS_2$ by CS. In both cases, the stimulus with more
motivational significance inhibits the processing of the stimulus with less motivational
significance. Poor conditioning with CS and US far apart in time occurs because by the time
the US arrives, the CS representation has decayed in short-term memory to a level that is
below the threshold for affecting efficacy of the appropriate synapses.

The  answers  to  subproblems  (1)  to  (4)  are  obtained  from  study  of  a  network  which 
includes  modifiable  associative  links  between  sensory  and  drive  representations  (in  both 
directions)  and  competitive  links  between  different  sensory  representations  (Figure  2).  The 
associative  links  do  not  obey  Hebb's  postulate  because  cross-correlation  is  counteracted  by 
decays; hence, synaptic efficacy can either increase or decrease with paired presynaptic and
postsynaptic activities (Grossberg, 1968, 1969, 1982), not just increase, as Hebb claimed
(Hebb, 1949). Such an associative law has recently received direct neurophysiological
support (Levy, Brassel, and Moore, 1983; Levy and Desmond, 1985; Rauschecker and Singer,
1979; Singer, 1983). The existence of drive representations was derived from an analysis
of the synchronization problem (Grossberg, 1971); that is, of how a stable conditioned
response can develop even if variable time lags occur between the CS and the US. These
drive representations, separate from the sensory representations of particular stimuli, are
what Bower has called emotion nodes (Bower, 1981; Bower, Gilligan, and Monteiro, 1981)
and Barto, Sutton, and Anderson (1983) have called adaptive critic elements. A US
unconditionally activates its drive representation if the drive level is sufficiently high. Repeated
pairing of a CS with, for example, a food US causes pairing of stimulation of the CS sensory
representation, denoted $S_{CS}$, with that of the representation for the hunger drive, denoted
$D_H$. The answer to subproblem (1) therefore depends on the strengthening of $S_{CS} \to D_H$
synapses according to an associative rule.
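
The way in which decay lets an associative trace either grow or shrink under paired activity can be sketched in a few lines. The sketch assumes a gated law of the form dz/dt = S(-Hz + K[x]+), patterned on the conditioned reinforcer association equations of the READ circuit given later in the article. The function name, the parameter values, and the Euler step are illustrative assumptions, not the values used in the reported simulations.

```python
def simulate_trace(z0, pre, post, H=1.0, K=1.0, dt=0.01, steps=2000):
    """Euler integration of dz/dt = pre * (-H*z + K*max(post, 0)).

    `pre` is the presynaptic sampling signal and `post` the postsynaptic
    activity; both are held constant here for simplicity (an assumption).
    """
    z = z0
    for _ in range(steps):
        z += dt * pre * (-H * z + K * max(post, 0.0))
    return z


grown = simulate_trace(z0=0.2, pre=1.0, post=0.8)    # trace starts below its target
shrunk = simulate_trace(z0=0.9, pre=1.0, post=0.3)   # trace starts above its target
frozen = simulate_trace(z0=0.9, pre=0.0, post=0.3)   # no presynaptic signal
```

In this sketch the trace moves toward K[x]+/H whenever the presynaptic signal is active, so paired activity can lower a trace that is too large, and the trace does not change at all when the presynaptic signal is zero.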

Subproblem (2) is answered using $D_H \to S_{CS}$ incentive motivational feedback. In
the blocking experiment, $S_{CS_1}$ is enhanced relative to $S_{CS_2}$. $S_{CS_2}$ thus tends to be
suppressed due to the competition between sensory representations that causes the limited capacity
of short term memory storage. Similarly, in the simultaneous phase of the ISI experiment,
$S_{US}$ is more enhanced than $S_{CS}$, so that $S_{CS}$ is suppressed.

The limited capacity of short-term memory, which is needed to answer subproblem (3),
arises from limited capacity properties of a recurrent on-center off-surround field, which
was originally derived to satisfy a more basic processing requirement: the ability to process
spatially distributed input patterns without irreparably distorting these patterns due to
either noise or saturation (Ellias and Grossberg, 1975; Grossberg and Levine, 1975). Figure
2 schematizes a network with modifiable sensory-to-drive and drive-to-sensory association
links and recurrent on-center off-surround links between sensory representations.
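
The limited capacity property can be illustrated with a simplified, feedforward shunting on-center off-surround network. This is a sketch under stated assumptions: the article's field is recurrent, whereas the code below uses the feedforward equation dx_i/dt = -A x_i + (B - x_i) I_i - x_i (sum of the other inputs), whose equilibrium is x_i = B I_i / (A + sum of all inputs). The function names and the parameter values are hypothetical.

```python
def shunting_steady_state(inputs, A=1.0, B=1.0):
    """Equilibrium of the feedforward shunting network (assumed simplification)."""
    total = sum(inputs)
    return [B * I / (A + total) for I in inputs]


def simulate_shunting(inputs, A=1.0, B=1.0, dt=0.001, steps=20000):
    """Euler integration of dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i*sum_{k != i} I_k."""
    x = [0.0] * len(inputs)
    total = sum(inputs)
    for _ in range(steps):
        x = [xi + dt * (-A * xi + (B - xi) * Ii - xi * (total - Ii))
             for xi, Ii in zip(x, inputs)]
    return x


equal = shunting_steady_state([1.0, 1.0])     # two equally strong cues
biased = shunting_steady_state([3.0, 1.0])    # first cue boosted, e.g. by feedback
simulated = simulate_shunting([3.0, 1.0])
```

Because total activity can never exceed B in this sketch, boosting the input to one representation necessarily lowers the stored activity of the other, which is the sense in which a motivationally favored cue can suppress its competitors.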

Our computer simulations, reported more completely in Grossberg and Levine (1987),
run through different stimulus conditions on the network of Figure 5, which is a variant of
Figure 2 with three sensory representations, $CS_1$, $CS_2$, and US. For simplicity, there is
only one drive representation, $D$, in our network. The $US \to D$ and $D \to US$ synapses are
fixed at a high value. The $CS \to D$ and $D \to CS$ synapses are strengthened by appearance
of the US while the CS short term memory representation is active. In this variant of the
network, sensory representations are divided into two successive stages. The activity $x_{i1}$
of the $i$th first stage can activate conditioned reinforcer pathways, whereas the activity $x_{i2}$
of the $i$th second stage receives conditioned incentive motivational pathways from $D$, and
can thereupon activate $x_{i1}$ and output motor pathways.

The same set of network parameters yielded both the ISI inverted-U curve in the case
of  only  one  CS  present,  and  blocking  in  the  case  of  two  CS’s.  In  both  cases,  the  CR 
anticipated  the  US. 

Figure  5 

Our  simulated  ISI  curves  (Figure  6)  were  qualitatively  compatible  with  experimental 
data  on  the  rabbit's  conditioned  nictitating  membrane  response  shown  in  Figure  4.  For 
ISIs of fewer than 2 time units in the numerical algorithm, competition from the US
representation prevented CS activity from staying above the $S_{CS} \to D$ pathway's threshold
long enough to appreciably increase the pathway's strength while $D$ was activated by the
US. At long ISIs, the prior decay of the CS's short term memory trace prevented the
$S_{CS} \to D$ pathway from sensing the later activation of $D$ by the US.

Figure  6 

In the blocking simulation (Figures 7a-7d), pairing of $CS_1$ with a delayed US enabled
the long term memory trace of the $CS_1 \to D$ pathway to achieve an S-shaped cumulative
learning curve. After $CS_1$ had become a conditioned reinforcer, it enhanced its own short
term memory storage by generating a large $S_{CS_1} \to D \to S_{CS_1}$ feedback signal. As a
result, when $CS_1$ and $CS_2$ were simultaneously presented, the short term memory activity
of $S_{CS_2}$ was quickly suppressed by competition from $S_{CS_1}$. Consequently, the long term
memory $S_{CS_2} \to D$ pathway did not grow in strength, preventing $CS_2$ from being a
conditioned reinforcer or eliciting a CR.

Figure  7 

4.  Comparison  with  Aplysia  Conditioning  Model 

An  alternative  explanation  of  blocking,  due  to  Hawkins  and  Kandel  (1984),  involved 
habituation  of  transmitter  pathways.  Based  on  invertebrate  evidence,  they  developed 
a  model  whereby  each  US  activates  a  facilitator  neuron  that  presynaptically  modulates 
CS pathways. They explain blocking (p. 385) by saying that “the output of the facilitator
neurons decreases when they are stimulated continuously”. Thus after a $CS_1$ is paired with
a US on a number of trials, subsequent presentation of a compound stimulus $CS_1 + CS_2$
with a US does not condition $CS_2$ because the facilitator neuron cannot fire adequately.
Hawkins and Kandel's explanation, however, is incompatible with the fact (Kamin, 1969)
that blocking can be overcome (“unblocked”) if $CS_1 + CS_2$ is paired with either a higher
or lower intensity of shock than $CS_1$ alone. Recent evidence (Matzel et al., 1985) indicates
that unblocking can also occur if the response to $CS_1$ is extinguished.

In  our  framework,  the  explanation  for  unblocking  depends  on  gated  dipole  opponent 
processes  that  link  together  “positive”  and  “negative”  drive  representations  (Figure  8). 
Positive  and  negative  channels  allow  for  a  comparison  between  current  and  expected  levels 
of  positive  or  negative  reinforcement.  The  more  complete  theory  of  Grossberg  (1982,  1987) 
which  includes  gated  dipoles  has  explained  such  unblocking  results  quantitatively. 

Figure  8 

In  the  remainder  of  the  article,  some  of  our  computer  simulation  results  using  gated 
dipoles  are  summarized.  A  more  systematic  development  is  provided  in  Grossberg  and 
Schmajuk  (1987).  Such  gated  dipoles  are  needed  because,  in  the  cognitive-reinforcement 
circuit,  CS’s  are  conditioned  to  either  the  onset  or  the  offset  of  a  reinforcer.  In  order  to 
explain how the offset of a reinforcer can generate an antagonistic rebound to which a
simultaneous  CS  can  be  conditioned,  gated  dipoles  were  introduced  by  Grossberg  (1972). 
A  gated  dipole  is  a  minimal  neural  network  which  is  capable  of  generating  a  sustained, 
but  habituative,  on-response  to  onset  of  a  cue,  as  well  as  a  transient  off-response,  or 
antagonistic  rebound,  to  offset  of  the  cue. 
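
The on-response and antagonistic rebound just described can be reproduced in a minimal feedforward gated dipole. The sketch below is illustrative only and makes several assumptions that are not taken from the article: the transmitter law dy/dt = B(1 - y) - Cxy with a linear signal function, outputs formed by rectified subtraction of the two gated signals, no recurrent feedback, and hypothetical parameter values.

```python
def simulate_gated_dipole(I=1.0, J=2.0, B=0.1, C=0.5,
                          t_on=40.0, t_off=40.0, dt=0.01):
    """Return (during, after): lists of (on_output, off_output) samples.

    `during` covers the interval in which the phasic input J is on,
    `after` covers the interval following its offset.
    """
    # Start both transmitters at equilibrium for tonic arousal alone.
    y1 = y2 = B / (B + C * I)
    during, after = [], []
    for phase, duration, store in (("on", t_on, during), ("off", t_off, after)):
        j = J if phase == "on" else 0.0
        for _ in range(round(duration / dt)):
            x1, x2 = I + j, I                 # on- and off-channel signals
            s1, s2 = x1 * y1, x2 * y2         # transmitter-gated signals
            store.append((max(s1 - s2, 0.0), max(s2 - s1, 0.0)))
            # Slow accumulation toward 1 and activity-dependent depletion.
            y1 += dt * (B * (1.0 - y1) - C * x1 * y1)
            y2 += dt * (B * (1.0 - y2) - C * x2 * y2)
    return during, after


during, after = simulate_gated_dipole()
```

In this sketch the on-output is largest at input onset and habituates to a smaller sustained value as the on-channel transmitter depletes. At input offset the depleted on-channel loses the competition, so the off-output shows a rebound that fades as the transmitter recovers.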

5.  The  READ  Circuit:  A  Synthesis  of  Opponent  Processing  and  Associative 
Learning  Mechanisms 

Although  several  varieties  of  a  gated  dipole  circuit  can  describe  the  association  between 
a  CS  with  the  onset  and  the  offset  of  a  reinforcer,  a  specialized  gated  dipole  is  needed 
to  explain  secondary  inhibitory  conditioning.  Secondary  inhibitory  conditioning  consists 
of two phases. In phase one, $CS_1$ becomes an excitatory conditioned reinforcer (e.g.,
source of conditioned fear) by being paired with a US (e.g., a shock). In phase two,
the offset of $CS_1$ can generate an off-response which can condition a subsequent $CS_2$ to
become an inhibitory conditioned reinforcer (e.g., source of conditioned relief). In order to
explain  secondary  inhibitory  conditioning,  a  gated  dipole  circuit  must  also  contain  internal 
feedback  pathways,  i.e.,  it  should  be  recurrent.  In  addition,  such  a  recurrent  gated  dipole 
must  be  joined  to  a  mechanism  of  associative  learning.  The  total  circuit  that  we  have 
analyzed  is  called  a  READ  circuit,  as  a  mnemonic  for  REcurrent  Associative  gated  Dipole 
(Figure  9). 

Figure  9 

The  equations  for  the  READ  circuit  are  as  follows: 

Arousal + US + Feedback On-Activation:

$$\frac{d}{dt} x_1 = -A_1 x_1 + I + J + T(x_7) \tag{1}$$

Arousal + Feedback Off-Activation:

$$\frac{d}{dt} x_2 = -A_2 x_2 + I + T(x_8) \tag{2}$$

On-Transmitter:

$$\frac{d}{dt} y_1 = B(1 - y_1) - C g(x_1) y_1 \tag{3}$$

Off-Transmitter:

$$\frac{d}{dt} y_2 = B(1 - y_2) - C g(x_2) y_2 \tag{4}$$

Gated On-Activation:

$$\frac{d}{dt} x_3 = -A_3 x_3 + D g(x_1) y_1 \tag{5}$$

Gated Off-Activation:

$$\frac{d}{dt} x_4 = -A_4 x_4 + D g(x_2) y_2 \tag{6}$$

Normalized Opponent On-Activation:

$$\frac{d}{dt} x_5 = -A_5 x_5 + (E - x_5) x_3 - (x_5 + F) x_4 \tag{7}$$

Normalized Opponent Off-Activation:

$$\frac{d}{dt} x_6 = -A_6 x_6 + (E - x_6) x_4 - (x_6 + F) x_3 \tag{8}$$

Total On-Activation:

$$\frac{d}{dt} x_7 = -A_7 x_7 + G[x_5]^+ + L \sum_{k=1}^{n} S_k z_{k7} \tag{11}$$

Total Off-Activation:

$$\frac{d}{dt} x_8 = -A_8 x_8 + G[x_6]^+ + L \sum_{k=1}^{n} S_k z_{k8} \tag{12}$$

On-Conditioned Reinforcer Association:

$$\frac{d}{dt} z_{k7} = S_k \left[ -H z_{k7} + K [x_5]^+ \right] \tag{13}$$

Off-Conditioned Reinforcer Association:

$$\frac{d}{dt} z_{k8} = S_k \left[ -H z_{k8} + K [x_6]^+ \right] \tag{14}$$

On-Output Signal:

$$O_1 = [x_5]^+ \tag{15}$$

Off-Output Signal:

$$O_2 = [x_6]^+, \tag{16}$$

where the notation $[x_i]^+$ denotes a linear signal above the threshold value zero; that is,
$[x_i]^+ = \max(x_i, 0)$.

In the equations, $I$ denotes the tonic arousal level, $J$ the US input, $S_k$ the $k$th CS,
and $z_{k7}$ and $z_{k8}$ the association of the $k$th CS with the on- and the off-response, respectively.
$A_i$ ($i = 1, \ldots, 8$), $B$, $C$, $D$, $E$, $F$, $G$, $H$, $K$, and $L$ are parameter values, which were
kept constant for all simulations. When $E = F$, $x_5$ and $x_6$ compute an opponent process and a
ratio scale at the same time. Thus one key property of the READ circuit is associative averaging,
rather than summation.
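
The statement about the case E = F can be checked by solving the normalized opponent on- and off-activation equations at equilibrium. Setting the time derivative to zero gives x5 = (E x3 - F x4) / (A5 + x3 + x4), and symmetrically for x6. The sketch below assumes constant gated activities x3 and x4 and a single decay rate A for both equations. These simplifications, the function names, and the parameter values are illustrative assumptions.

```python
def opponent_steady_state(x3, x4, A=1.0, E=1.0, F=1.0):
    """Equilibria of the normalized opponent equations for fixed x3 and x4."""
    x5 = (E * x3 - F * x4) / (A + x3 + x4)
    x6 = (E * x4 - F * x3) / (A + x3 + x4)
    return x5, x6


def integrate_opponent(x3, x4, A=1.0, E=1.0, F=1.0, dt=0.001, steps=20000):
    """Euler integration of the same two equations, as a numerical check."""
    x5 = x6 = 0.0
    for _ in range(steps):
        dx5 = -A * x5 + (E - x5) * x3 - (x5 + F) * x4
        dx6 = -A * x6 + (E - x6) * x4 - (x6 + F) * x3
        x5 += dt * dx5
        x6 += dt * dx6
    return x5, x6


small = opponent_steady_state(2.0, 1.0)       # moderate inputs
large = opponent_steady_state(200.0, 100.0)   # same ratio, 100 times larger
balanced = opponent_steady_state(1.5, 1.5)    # equal on and off inputs
checked = integrate_opponent(2.0, 1.0)
```

With E = F the equilibrium is proportional to (x3 - x4) / (A + x3 + x4): the numerator is the opponent difference, and the denominator normalizes it, so that for inputs large relative to A the result depends mainly on the ratio of the two inputs rather than on their absolute size.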

6. Opponent Extinction by Dissociating Long Term Memory Read-In and
Read-Out at Dendritic Spines

A second key property of the READ circuit has been called opponent extinction.
Although passive memory decay does not occur in the parameter ranges which we used, when
the net signals in the on- and off-channels are balanced, then $x_5 = 0 = x_6$, and therefore
$z_{k7}$ and $z_{k8}$ approach 0. The LTM traces thereby continually readjust themselves to the
net imbalance between the on- and off-channels. Opponent extinction avoids the possible
saturation at maximal values of the LTM traces $z_{k7}$ and $z_{k8}$.

A third key property of the READ circuit is a dissociation between read-in and
read-out of long-term memory (LTM), as in Figure 10. For example, in the on-channel, read-out
is proportional to $[x_7]^+$, whereas read-in is proportional to $[x_5]^+$. Grossberg (1975)
proposed that such dissociation can be physiologically implemented by assuming that synaptic
plasticity occurs at the dendritic spines of neural cells. Signal $[x_5]^+$ is assumed to cause
a global potential change that invades all the spines, inducing plastic changes throughout
the dendritic column, as in equation (13). However, due to the geometry and electrical
properties of the dendritic tree, an input that activates a particular dendritic branch may
not be influenced by inputs that activate different dendritic branches. Activation at a
particular dendritic branch would produce local potentials that propagate to the cell body
where they influence axonal firing via potential $x_7$ in equation (11).

Figure  10 

7.  Computer  Simulations  of  Primary  and  Secondary  Conditioning 

This section summarizes computer simulations in different classical conditioning
paradigms. Although the simulations show the competence of the READ circuit in these
paradigms,  additional  neural  machinery  (such  as  the  ART  circuit  in  Figure  1)  is  necessary 
to  explain  some  difficult  conditioning  data. 

Excitatory primary conditioning. Because the CS is presented in the presence of the
US, it becomes associated with the on-response. Variable $CS_1$-ON describes conditioning
of the LTM trace $z_{17}$ within the pathway from the sensory representation of $CS_1$ to the
on-channel. After 10 acquisition trials, presentations of $CS_1$ alone do not cause extinction of
the $CS_1$-ON association (Figure 11). As explained later in the text, forgetting of $CS_1$-ON
associations is due to the acquisition of $CS_1$-OFF associations.

Figure  11 

Inhibitory primary conditioning. Because the CS is presented after the US offset, it
becomes associated with the off-response. Variable $CS_1$-OFF describes conditioning of
the LTM trace $z_{18}$ within the pathway from the sensory representation of $CS_1$ to the
off-channel. After 10 acquisition trials, presentations of $CS_1$ alone cause the $CS_1$-OFF
association to relax to a persistent remembered value (Figure 12). As explained later
in the text, forgetting of the $CS_1$-OFF association is due to the acquisition of $CS_1$-ON
associations.

Figure  12 

In Grossberg and Schmajuk (1987), the following types of secondary conditioning
phenomena are also simulated:

Excitatory secondary conditioning. The LTM trace $CS_1$-ON grows during the first 10
trials and is then used to induce the growth of the LTM trace $CS_2$-ON during the next 10
trials.

Inhibitory secondary conditioning. The LTM trace $CS_1$-ON grows during the first 10
trials and is then used, by presenting a $CS_2$ after $CS_1$ offset, to induce the growth of the
LTM trace $CS_2$-OFF during the next 10 trials.

8.  Qualitative  Explanations  of  Extinction  and  Non-Extinction  Data 

This  section  presents  qualitative  explanations  for  some  difficult  conditioning  data  that 
require  additional  neural  machinery,  such  as  STM  attentional  modulation  and  STM  reset 
by  expectancy  mismatch  by  an  ART  circuit. 

Excitatory conditioning and extinction. When a CS, say $CS_1$, is paired with an aversive US on
successive conditioning trials, the sensory representation $S_1$ of $CS_1$ is conditioned to the
drive representation $D_{on}$ corresponding to the fear reaction, both through its conditioned
reinforcer path $S_1 \to D_{on}$ and through its incentive motivational path $D_{on} \to S_1$. As a
result, later presentations of $CS_1$ tend to generate an amplified STM activation of $S_1$, and
thus $CS_1$ is preferentially attended. Due to the limited capacity of STM, less salient cues
tend to be attentionally blocked when $CS_1$ is presented.

As the cognitive-motivational feedback loop $S_1 \to D_{on} \to S_1$ is strengthened during
conditioning trials, $S_1$ is also associated to a sensory expectation of the shock within an
ART circuit. During extinction, $CS_1$ is presented on unshocked trials. Parameters of the
READ circuit are chosen to prevent passive decay of LTM traces from occurring on these
trials. However, when the expected shock does not occur, a mismatch occurs with the
learned expectation read out by $S_1$, the STM activity of $S_1$ is reduced by the consequent
STM reset, and an antagonistic rebound occurs in the off-channel of the READ circuit.
Consequently, $S_1$ is associated to an antagonistic rebound at $D_{off}$. Because the STM activity
of $S_1$ is smaller after reset than before, $S_1 \to D_{off}$ associations take place at a slower rate
than during conditioning. After several learning trials, however, the pathway $S_1 \to D_{off}$ is as
strong as the $S_1 \to D_{on}$ pathway, and opponent extinction occurs.

Inhibitory conditioning and non-extinction. Suppose that $CS_1$ has become a
conditioned excitor, and that $CS_1$ and $CS_2$ are presented together in the absence of the US. When
$CS_1$ and $CS_2$ are simultaneously presented (Figure 13), $S_1$'s activity is amplified by
positive feedback through the strong conditioned $S_1 \to D_{on} \to S_1$ pathway. As a result of the
limited capacity of STM, the STM activity of $S_2$ is blocked at time $T_1$. When the expected
US does not occur at time $T_2$, the mismatch with $S_1$'s sensory expectation causes both $S_1$
and $S_2$ to be reset, and $S_1$'s STM activity decreases while $S_2$'s STM activity increases. Due
to $S_1$'s decrease, a rebound occurs at $D_{off}$. Consequently, the unexpected nonoccurrence
of the shock enables $S_2$ to become associated with $D_{off}$ in both the pathways $S_2 \to D_{off}$
and $D_{off} \to S_2$. These are the primary cognitive-motivational conditioning events that
turn $CS_2$ into a conditioned inhibitor.

Figure  13 

According to the READ circuit, when CS2 is presented alone, the conditioned value of
CS2 → Doff persists. No further extinction occurs because the CS2 sensory expectation
predicts the absence of the US. Thus when presented alone, CS2 does not disconfirm its
sensory expectation, and S2’s STM activity is not reset.


9.  Conclusion 


At least four types of learning processes are relevant in the present paper: learning
of conditioned reinforcement, incentive motivation, sensory expectancy, and motor
command. These several types of learning processes, which operate on a slow time scale,
regulate and are regulated by rapidly fluctuating limited capacity STM representations of
sensory events. The theory suggests how nonlinear feedback interactions among these fast
information processing mechanisms and slow learning mechanisms participate in different
conditioning paradigms, and actively regulate learning and memory to generate predictive
internal representations of external environmental contingencies.

FIGURE CAPTIONS


Figure 1. Anatomy of an adaptive resonance theory (ART) circuit: (a) Interactions
between the attentional and orienting subsystems. Code learning takes place at the long
term memory (LTM) traces within the bottom-up and top-down pathways between levels
F1 and F2. The top-down pathways can read-out learned expectations, or templates, that
are matched against bottom-up input patterns at F1. Mismatches activate the orienting
subsystem A, thereby resetting short term memory (STM) at F2 and initiating search for
another recognition code. Subsystem A can also activate an orienting response. Sensitivity
to mismatch at F1 is modulated by vigilance signals from drive representations. (b) Trainable
pathways exist between level F2 and the drive representations. Learning from F2 to a
drive representation endows a recognition category with conditioned reinforcer properties.
Learning from a drive representation to F2 associates the drive representation with a set
of motivationally compatible categories. (Adapted from Carpenter and Grossberg, 1987c.)

Figure 2. Schematic conditioning circuit: Conditioned stimuli (CSi) activate sensory
representations (S_CSi) which compete among themselves for limited capacity short term
memory activation and storage. The activated S_CSi elicit conditioned signals to drive
representations and motor command representations. Learning from an S_CSi to a drive
representation D is called conditioned reinforcer learning. Learning from D to S_CSi is called
incentive motivational learning. Signals from D to S_CSi are elicited when the combination of
external sensory plus internal drive inputs is sufficiently large. In the simulations reported
herein, the drive level is assumed to be large and constant.

Figure  3.  A  blocking  paradigm.  The  two  stages  of  the  experiment  are  discussed  in 
the  text. 

Figure 4. Experimental relationship between conditioned response strength (measured
by percentage of trials on which response occurs) and interstimulus interval in the
rabbit nictitating membrane response. (Reprinted with permission from Sutton and Barto,
1981.)

Figure 5. Simulated network: Each sensory representation possesses two stages with
STM activities xi1 and xi2. A CS or US input activates its corresponding xi1. Activation
of xi1 elicits unconditionable signals to xi2 and conditioned reinforcer signals to D, whose
activity is denoted by y. Incentive motivational feedback signals from D activate the
second stage potentials xi2, which then send feedback signals to xi1. Conditionable long
term memory traces are designated by hemi-disks.

Figure 6. Plot of CR acquisition speed as a function of ISI. This speed was computed
by the formula 100 × (number of time units per trial)/(number of time units to first CR).
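
As a worked instance of this formula, the short Python helper below computes the
acquisition speed. The function name and the numbers in the usage line are hypothetical
and are not taken from the simulations.

```python
def acquisition_speed(time_units_per_trial, time_units_to_first_cr):
    """Return 100 x (time units per trial) / (time units to first CR)."""
    if time_units_to_first_cr <= 0:
        raise ValueError("time_units_to_first_cr must be positive")
    return 100.0 * time_units_per_trial / time_units_to_first_cr


if __name__ == "__main__":
    # Hypothetical case: trials last 50 time units and the first CR
    # appears after 200 time units, i.e. during the fourth trial.
    print(acquisition_speed(50, 200))
```

Under this measure a CR that first appears within the first trial scores 100 or more, and
slower acquisition gives proportionally smaller values.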

Figure 7. Blocking simulation: In (a)-(d), the ISI = 6 between CS1 and US onset.
Five trials of CS1-US pairing are followed by five trials of (CS1 + CS2)-US pairing. Then
CS2 is presented alone for one trial. (a) Activity x11 of S_CS1 through time; (b) Activity
x21 of S_CS2 through time; (c) LTM trace z11 from S_CS1 to D through time; (d) LTM trace
z21 from S_CS2 to D through time.

Figure  8.  Example  of  a  feedforward  gated  dipole:  A  sustained  habituating  on-response 
(top  left)  and  a  transient  off-rebound  (top  right)  are  elicited  in  response  to  onset  and 
offset,  respectively,  of  a  phasic  input  J  (bottom  left)  when  tonic  arousal  I  (bottom  center) 
and  opponent  processing  (diagonal  pathways)  supplement  the  slow  gating  actions  (square 
synapses).  See  text  for  details. 
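
The qualitative behavior described in this caption can be reproduced with a minimal
Python sketch of a feedforward gated dipole. The transmitter law dz/dt = A(B - z) - C S z,
the linear signal function, the Euler integration, and every parameter value below are
assumptions chosen for illustration. They are not the equations or the parameters A
through M of the READ circuit simulations.

```python
def simulate_gated_dipole(I=1.0, J=1.0, t_on=10.0, t_off=40.0, t_end=80.0,
                          A=0.1, B=1.0, C=0.1, dt=0.01):
    """Euler simulation of a feedforward gated dipole (illustrative values).

    I : tonic arousal delivered to both channels
    J : phasic input delivered to the on-channel for t_on <= t < t_off
    Returns lists of times, on-channel outputs and off-channel outputs.
    """
    z_rest = A * B / (A + C * I)      # transmitter equilibrium under arousal alone
    z_on = z_off = z_rest
    times, on_out, off_out = [], [], []

    for k in range(int(round(t_end / dt))):
        t = k * dt
        s_on = I + (J if t_on <= t < t_off else 0.0)   # on-channel signal
        s_off = I                                      # off-channel signal

        # Slow gating: each signal is multiplied by its habituating transmitter.
        g_on, g_off = s_on * z_on, s_off * z_off

        # Opponent processing: rectified difference of the gated signals.
        on_out.append(max(g_on - g_off, 0.0))
        off_out.append(max(g_off - g_on, 0.0))
        times.append(t)

        # Transmitter accumulation toward B and signal-dependent depletion.
        z_on += dt * (A * (B - z_on) - C * s_on * z_on)
        z_off += dt * (A * (B - z_off) - C * s_off * z_off)

    return times, on_out, off_out


if __name__ == "__main__":
    times, on_out, off_out = simulate_gated_dipole()
    print("peak on-response:", round(max(on_out), 3))
    print("peak off-rebound:", round(max(off_out), 3))
```

In this sketch the on-response overshoots at input onset and habituates to a lower
sustained level, and the off-rebound appears only after input offset and then decays as
the depleted on-channel transmitter recovers.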

Figure 9. A READ I circuit: This circuit joins together a recurrent gated dipole
with an associative learning mechanism. Learning is driven by signals Sk from sensory
representations Sk which activate long term memory (LTM) traces zk7 and zk8 that sample
activation levels at the on-channel and off-channel, respectively, of the gated dipole. See
text for details.

Figure 10. A possible microarchitecture for dissociation of LTM read-in and read-out:
Individual LTM-gated sensory signals are read-out into local potentials which are
summed by the total cell body potential x7 without significantly influencing each other’s
learned read-in. In contrast, the input signal x5 triggers a massive global cell activation
which drives learned read-in at all active LTM traces abutting the cell surface. Signal x5
also activates the cell body potential x7.

Figure 11. Computer simulation of primary excitatory conditioning and extinction
with slow habituation and large feedback in a READ I circuit: CS1 is paired with the US
during the first 10 simulated trials, and CS1 is presented in the absence of the US in the
next 10 simulated trials. The numbers above each plot are the maximum and minimum
values of the plot. Parameters are A = 1, B = .005, C = .00125, D = 20, E = 20, F =
20, G = .5, H = .005, K = .025, L = 20, M = .05.

Figure 12. Computer simulation of primary inhibitory conditioning and extinction
with slow habituation and large feedback in a READ I circuit: CS1 is presented after the
US offset during the first 10 simulated trials, and CS1 is presented in the absence of the
US in the next 10 simulated trials. The same parameters were used as in Figure 11.

Figure 13. Presentation of CS1 and CS2 when CS1 has become a conditioned excitor
and the compound stimulus is followed by no-shock: During the no-shock interval between
times T1 and T2, S1 is actively amplified by positive feedback and S2 is blocked.
Nonoccurrence of the expected shock causes both S1 and S2 to be reset. S1’s STM activity decreases
and S2’s STM activity increases. Due to S1’s decrease, Don also decreases, thereby causing
a rebound at Doff. This rebound becomes associated with the increased activity of S2.